Improving Data Quality Through Effective Use of Data Semantics
Data quality issues have taken on increasing importance in recent years. In our research, we have discovered that many “data quality” problems are actually “data misinterpretation” problems – that is, problems with data semantics. In this paper, we first illustrate some examples of these problems and then introduce a particular semantic problem that we call “corporate householding.” We stress the importance of “context” in obtaining the appropriate answer for each task. We then propose an approach to handling these tasks using extensions to the COntext INterchange (COIN) technology for knowledge storage and knowledge processing. (Singapore-MIT Alliance, SMA)
A Lightweight Ontology Approach to Scalable Interoperability
There are many different kinds of ontologies used for different purposes in modern computing. Lightweight ontologies are easy to create but difficult to deploy; formal ontologies are relatively easy to deploy but difficult to create. This paper presents an approach that combines the strengths and avoids the weaknesses of lightweight and formal ontologies. In this approach, the ontology includes only high-level concepts; subtle differences in the interpretation of the concepts are captured as context descriptions outside the ontology. The resulting ontology is simple and thus easy to create. The context descriptions facilitate data conversion composition, which leads to a scalable solution to semantic interoperability among disparate data sources and contexts.
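The composition idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the concept ("price"), the context attributes (scale and currency), and the fixed exchange rate are all hypothetical.

```python
# Hypothetical sketch: one high-level concept ("price") lives in the ontology;
# differences in interpretation (reporting scale, currency) are captured as
# context descriptions outside it. Conversions compose through a canonical
# context instead of being hand-written for every source/receiver pair.

USD_PER_EUR = 1.1  # assumed fixed rate, purely for illustration

def to_canonical(value, ctx):
    value = value * ctx["scale"]          # undo reporting scale (e.g. thousands)
    if ctx["currency"] == "EUR":
        value = value * USD_PER_EUR       # normalize to USD
    return value

def from_canonical(value, ctx):
    if ctx["currency"] == "EUR":
        value = value / USD_PER_EUR
    return value / ctx["scale"]

def convert(value, src_ctx, dst_ctx):
    # Composition: only per-context sub-conversions to/from the canonical
    # form are needed, not one program per (source, receiver) pair.
    return from_canonical(to_canonical(value, src_ctx), dst_ctx)

src = {"scale": 1000, "currency": "USD"}   # source reports in thousands of USD
dst = {"scale": 1, "currency": "USD"}      # receiver expects plain USD
print(convert(5, src, dst))                # 5 thousand USD expressed in plain USD
```

Adding a new source or receiver then only requires describing its context, since the pairwise conversions fall out of the composition.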
A Systems Theoretic Approach to the Security Threats in Cyber Physical Systems Applied to Stuxnet
Cyber Physical Systems (CPSs) are increasingly being adopted in a wide range of industries, such as smart power grids. Even though the rapid proliferation of CPSs brings huge benefits to our society, it also provides potential attackers with many new opportunities to affect the physical world, such as by disrupting the services controlled by CPSs. Stuxnet is an example of such an attack that was designed to interrupt the Iranian nuclear program. In this paper, we show how the vulnerabilities exploited by Stuxnet could have been addressed at the design level. We utilize a systems theoretic approach, based on prior research on system safety, that takes both physical and cyber components into account to analyze the threats exploited by Stuxnet. We conclude that such an approach is capable of identifying cyber threats to CPSs at the design level and can provide practical recommendations that CPS designers can utilize to design a more secure CPS.
Studying the tension between digital innovation and cybersecurity
With increasing economic pressures and exponential growth in technological innovations, companies are increasingly relying on digital technologies for innovation and value creation. But with increasing levels of cybersecurity breaches, the trustworthiness of many established and new technologies is of concern. Consequently, companies are aggressively increasing the cybersecurity of their existing and new digital assets. Most companies must manage these priorities simultaneously, and because the priorities frequently conflict, tensions arise. This paper introduces a framework for evaluating these risk/reward trade-offs. Through a survey and interviews, companies are positioned in different quadrants of an innovation/cybersecurity matrix, overlaid with the negative impact of cybersecurity controls on innovative projects. The paper analyzes the industry-level, firm-level, technology management, and technology maturity factors that affect these trade-offs. Finally, a set of recommendations is provided to help a company evaluate its position on the matrix, understand the underlying factors, and better manage these trade-offs. Keywords: Cybersecurity, digital innovation, CIOs
Semantic Integration Approach to Efficient Business Data Supply Chain: Integration Approach to Interoperable XBRL
As an open standard for electronic communication of business and financial data, XBRL has the potential of improving the efficiency of the business data supply chain. A number of jurisdictions have developed different XBRL taxonomies as their data standards. Semantic heterogeneity exists in these taxonomies, the corresponding instances, and the internal systems that store the original data. Consequently, there are still substantial difficulties in creating and using XBRL instances that involve multiple taxonomies. To fully realize the potential benefits of XBRL, we have to develop technologies to reconcile semantic heterogeneity and enable interoperability of various parts of the supply chain. In this paper, we analyze the XBRL standard and use examples of different taxonomies to illustrate the interoperability challenge. We also propose a technical solution that incorporates schema matching and context mediation techniques to improve the efficiency of the production and consumption of XBRL data.
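As a rough sketch of how schema matching and context mediation fit together, consider reconciling a single fact across two hypothetical taxonomies. The element names, the match table, and the reporting scales below are illustrative assumptions, not actual XBRL taxonomy content.

```python
# Hypothetical sketch: schema matching supplies element-name correspondences
# between two XBRL-like taxonomies; context mediation reconciles the
# reporting-scale difference. All names and scales are assumed for the example.

MATCHES = {"us:Revenues": "ifrs:Revenue"}                # schema-matching output
CONTEXT = {"us": {"scale": 1}, "ifrs": {"scale": 1000}}  # monetary reporting scale

def mediate(fact, src, dst):
    name, value = fact
    target = MATCHES.get(name, name)                     # rename per the match table
    # rescale from the source context to the destination context
    return target, value * CONTEXT[src]["scale"] / CONTEXT[dst]["scale"]

# A value reported in plain dollars, re-expressed in a thousands-scaled taxonomy:
print(mediate(("us:Revenues", 2_000_000), "us", "ifrs"))
```

A real mediator would of course read the contexts from the instance documents rather than a hard-coded table; the point is only that the two techniques address different halves of the heterogeneity.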
Evaluating and Aggregating Data Believability across Quality Sub-Dimensions and Data Lineage
Data quality is crucial for operational efficiency and sound decision making. This paper focuses on believability, a major aspect of data quality. The issue of believability is particularly relevant in the context of Web 2.0, where mashups facilitate the combination of data from different sources. Our approach for assessing data believability is based on provenance and lineage, i.e., the origin and subsequent processing history of data. We present the main concepts of our model for representing and storing data provenance, and an ontology of the sub-dimensions of data believability. We then use aggregation operators to compute believability across the sub-dimensions of data believability and the provenance of data. We illustrate our approach with a scenario based on Internet data. Our contribution lies in three main design artifacts: (1) the provenance model, (2) the ontology of believability sub-dimensions, and (3) the method for computing and aggregating data believability. To our knowledge, this is the first work to operationalize provenance-based assessment of data believability.
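The aggregation step can be sketched as follows. The sub-dimension names, the weighted average across sub-dimensions, and the minimum along lineage are illustrative assumptions, not the paper's exact operators.

```python
# Hypothetical sketch: believability scores in [0, 1] are first aggregated
# across sub-dimensions, then propagated along data lineage. The operators
# (weighted average, minimum) are assumed for illustration.

def believability(scores, weights):
    # weighted average across sub-dimensions
    total = sum(weights.values())
    return sum(scores[d] * weights[d] for d in scores) / total

def lineage_believability(source_scores):
    # conservative aggregation along lineage: combined data is no more
    # believable than its least believable input
    return min(source_scores)

scores = {"trustworthiness": 0.9, "reasonableness": 0.8, "temporality": 0.6}
weights = {"trustworthiness": 2, "reasonableness": 1, "temporality": 1}
b = believability(scores, weights)       # (0.9*2 + 0.8 + 0.6) / 4
print(b)
print(lineage_believability([b, 0.5]))   # the weaker source dominates
```

Other operator choices (e.g. a product, or an ordered weighted average) would encode different attitudes toward risk; the framework only requires that the operators be fixed and explicit.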
Addressing the Challenges of Aggregational and Temporal Ontological Heterogeneity
In this paper, we first identify semantic heterogeneities that, when not resolved, often cause serious data quality problems. We discuss the especially challenging problems of temporal and aggregational ontological heterogeneity, which concern how complex entities and their relationships are aggregated and reinterpreted over time. We then illustrate how the COntext INterchange (COIN) technology can be used to capture data semantics and reconcile semantic heterogeneities in a scalable manner, thereby improving data quality. (Singapore-MIT Alliance, SMA)
Reconciliation of temporal semantic heterogeneity in evolving information systems
The change in meaning of data over time poses significant challenges for the use of that data. These challenges exist in the use of an individual data source and are further compounded by the integration of multiple sources. In this paper, we identify three types of temporal semantic heterogeneity. We propose a solution based on extensions to the Context Interchange framework, which has mechanisms for capturing semantics using an ontology and temporal contexts. It also provides a mediation service that automatically reconciles semantic conflicts. We show the feasibility of this approach with a prototype that implements a subset of the proposed extensions.
Measuring Data Believability: A Provenance Approach
Data quality is crucial for operational efficiency and sound decision making. This paper focuses on believability, a major aspect of data quality, measured along three dimensions: trustworthiness, reasonableness, and temporality. We ground our approach in provenance, i.e., the origin and subsequent processing history of data. We present our provenance model and our approach for computing believability based on provenance metadata. The approach is structured into three increasingly complex building blocks: (1) definition of metrics for assessing the believability of data sources, (2) definition of metrics for assessing the believability of data resulting from one process run, and (3) assessment of believability based on all the sources and processing history of data. We illustrate our approach with a scenario based on Internet data. To our knowledge, this is the first work to develop a precise approach to measuring data believability that makes explicit use of provenance-based measurements.
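The three building blocks can be sketched as a recursive walk over a provenance graph. The item names, scores, and the min-times-reliability combination rule are assumptions for illustration only.

```python
# Hypothetical sketch of the three building blocks: per-source scores (1),
# a per-process-run combination (2), and recursive assessment over the full
# processing history (3). The combination rule is an illustrative assumption.

SOURCE_SCORES = {"sensor_a": 0.9, "feed_b": 0.7}     # building block (1)

# provenance: each derived item maps to the inputs its producing process consumed
PROVENANCE = {
    "cleaned": ["sensor_a", "feed_b"],
    "report": ["cleaned"],
}

PROCESS_RELIABILITY = {"cleaned": 0.95, "report": 1.0}

def believability(item):
    if item in SOURCE_SCORES:
        return SOURCE_SCORES[item]                   # base case: a raw source
    inputs = PROVENANCE[item]
    combined = min(believability(i) for i in inputs)     # recurse over history (3)
    return combined * PROCESS_RELIABILITY[item]          # one process run (2)

print(believability("report"))   # min(0.9, 0.7) discounted by process reliability
```

Because the metadata forms a directed acyclic graph, the recursion terminates at the raw sources, and every derived item's score is a deterministic function of its full lineage.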
Context Interchange as a Scalable Solution to Interoperating Amongst Heterogeneous Dynamic Services
Many online services access a large number of autonomous data sources and at the same time need to meet different user requirements. It is essential for these services to achieve semantic interoperability among these information exchange entities. In the presence of an increasing number of proprietary business processes, heterogeneous data standards, and diverse user requirements, it is critical that the services be implemented using adaptable, extensible, and scalable technology. The COntext INterchange (COIN) approach, inspired by the similar goals of the Semantic Web, provides a robust solution. In this paper, we describe how COIN can be used to implement dynamic online services where semantic differences are reconciled on the fly. We show that COIN is flexible and scalable by comparing it with several conventional approaches. With a given ontology, the number of conversions in COIN is quadratic in the number of distinctions of the semantic aspect with the most distinctions. These semantic aspects are modeled as modifiers in a conceptual ontology; in most cases, the number of conversions is linear in the number of modifiers, which is significantly smaller than in the traditional hard-wired middleware approach, where the number of conversion programs is quadratic in the number of sources and data receivers. In the example scenario in the paper, the COIN approach requires only 5 conversions to be defined, while traditional approaches require 20,000 to 100 million. COIN achieves this scalability by automatically composing all the comprehensive conversions from a small number of declaratively defined sub-conversions. (Singapore-MIT Alliance, SMA)
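The counting argument behind this scalability claim can be made concrete with a back-of-the-envelope sketch. The counting rules are simplified assumptions: a hard-wired approach needs one conversion program per (source, receiver) pair, while a COIN-style approach needs sub-conversions per modifier (per ordered pair of distinctions in the worst case) and composes everything else automatically. The example parameter values are not from the paper.

```python
# Back-of-the-envelope sketch of the conversion-count comparison.

def hardwired(n_sources, n_receivers):
    # one dedicated conversion program per (source, receiver) pair
    return n_sources * n_receivers

def coin_worst_case(distinctions_per_modifier):
    # worst case: one sub-conversion per ordered pair of distinctions,
    # per modifier; the mediator composes the full conversions from these
    return sum(k * (k - 1) for k in distinctions_per_modifier)

# e.g. 200 sources and 100 receivers, vs. 3 modifiers with 2 distinctions each
print(hardwired(200, 100))          # grows multiplicatively with participants
print(coin_worst_case([2, 2, 2]))   # grows only with modifiers and distinctions
```

The gap widens as participants are added: each new source or receiver multiplies the hard-wired count, but leaves the COIN count unchanged unless it introduces a new distinction.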